BMC Genomics
Top medRxiv preprints most likely to be published in this journal, ranked by match strength.
Measuring vector-human contact in a natural setting can inform precise targeting of interventions to interrupt transmission of vector-borne diseases. One approach is to directly match human DNA in vector bloodmeals to the individuals who were bitten using genotype panels of discriminative short tandem repeats (STRs). Existing methods for matching STR profiles in bloodmeals to the people bitten preclude the ability to match most incomplete profiles and multi-source bloodmeals to bitten indivi...
Inefficient aldehyde metabolism by an aldehyde dehydrogenase 2 (ALDH2) genetic variant, ALDH2*2 (rs671), increases the risk of esophageal cancer with alcohol consumption. Here we tested the hypothesis that additional genetic differences in ALDH2 besides ALDH2*2 exist resulting in inefficient acetaldehyde metabolism after alcohol consumption. Human volunteers were recruited who self-reported flushing after alcohol. The first stage recruited East Asians and the second stage non-East Asians. After...
Background: During the SARS-CoV-2 pandemic, many countries directed substantial resources towards genomic surveillance to detect and track viral variants. There is a debate over how much sequencing effort is necessary in national surveillance programs for SARS-CoV-2 and future pandemic threats. Aim: We aimed to investigate the effect of reduced sequencing on surveillance outcomes in a large genomic dataset from Switzerland, comprising more than 143k sequences. Methods: We employed a uniform downsamp...
Early detection of human disease is associated with improved clinical outcomes. However, many diseases are detected only at an advanced, symptomatic stage, when patients are past the period of efficacious treatment, which can result in less favorable outcomes. Therefore, methods that can accurately detect human disease at a presymptomatic stage are urgently needed. Here, we introduce "frequentmers": short sequences that are specific and recurrently observed in either patient or healthy control samples, ...
Introduction: Whole exome sequencing (WES) has become a more accessible diagnostic tool in the clinical genetic context, leading to debate over the most accurate and effective bioinformatic pipeline solutions to evaluate variants that explain diseases. Objective: This study aimed to evaluate twenty-four pipelines in two samples, comparing accuracy, time and computing efficiency. We also contrasted the results based on regions in two of the most common capture kits. Materials and methods: We used two a...
Background: The Taiwan Biobank (TWB) project has built a nationwide database to facilitate basic and clinical collaboration within the island and internationally, and it is one of the valuable public datasets of the East Asian population. This study provided comprehensive genomic medicine findings from 1,496 WGS data from TWB. Methods: We reanalyzed 1,496 Illumina-based whole genome sequences (WGS) of Taiwanese participants with at least 30X depth of coverage by Sentieon DNAscope, a precisionFDA cha...
Background: Measuring and estimating alcohol consumption (AC) is important for individual health, public health, and societal benefits. While self-report and diagnostic interviews are commonly used, incorporating biological-based indices can offer a complementary approach. Methods: We evaluate machine learning (ML) based predictions of AC using blood and urine-derived biomarkers. This research has been conducted using the UK Biobank (UKB) Resource. In addition to the prediction of the number of alc...
Plasmids are extrachromosomal mobile genetic elements that often carry genes responsible for antimicrobial resistance. Plasmid epidemiology aims to track the evolution and spread of plasmids, but the field currently faces significant barriers that make practical implementation using whole genome sequence data difficult. Hybrid-assembled genomes remain the most reliable way to identify and track complete plasmids; however, most genomic surveillance data exists in the form of short-read sequenci...
Alcohol dependence and cirrhosis are key outcomes of excessive alcohol use. We studied the interaction between genetics and epigenetics at the aldehyde dehydrogenase (ALDH2) locus to understand differences in vulnerability to cirrhosis. Individuals were selected according to ICD-10 criteria for Alcohol dependence with Cirrhosis (AUDC+ve, N=116) and Alcohol dependence but without Cirrhosis (AUDC-ve, N=123) from the clinical services of Gastroenterology and Psychiatry at the St Johns Medical Coll...
Rabies virus (RABV), a fatal zoonotic pathogen, remains a significant public health concern, with bat-maintained lineages accounting for all currently documented cases in Brazil. Despite the availability of pharmacological prophylaxis for humans and animals, the high genetic diversity of RABV in diverse natural bat hosts and continued circulation in multiple animals pose challenges for effective surveillance. Here, we developed and validated a novel, rapidly deployable amplicon-based sequencing ...
Quantitatively understanding local transmission dynamics is essential for designing effective prevention strategies. In this study, we developed a novel algorithm to identify introductions and trace locally circulating clusters. We analyzed over 26,000 SARS-CoV-2 genomes and their associated metadata, collected between January and October 2021, to explore introduction and dispersal patterns in Greater Houston, a major metropolitan area known for its demographic diversity. Our analysis identified...
As the SARS-CoV-2 virus mutates, mutations harboured in patients become increasingly diverse. Patients classified into two strains may have overlapping non-variant-defining mutations. Mutation calling by sequencing is relative to a reference genome. As SARS-CoV-2 mutates, tracking emerging mutant strains may become increasingly problematic if the reference genome remains Wuhan-Hu-1, because the comparison then becomes indirect: current dominant strain relative to Wuhan-Hu-1 versus emerging stra...
Emerging fungal pathogens, such as Coccidioides, the causative agent of Valley fever, pose significant clinical and public health challenges. While advances in genomic epidemiology have enhanced our understanding of Coccidioides evolutionary history, the lack of standardized tools for variant identification makes it difficult to draw comparisons between studies. To address this gap, we developed and benchmarked a novel, publicly available pipeline, cocci-call, designed for genome-wide variant id...
The scale of data produced during the SARS-CoV-2 pandemic has been unprecedented, with more than 5 million sequences shared publicly at the time of writing. This wealth of sequence data provides important context for interpreting local outbreaks. However, placing sequences of interest into national and international context is difficult given the size of the global dataset. Often outbreak investigations and genomic surveillance efforts require running similar analyses again and again on the late...
The modern response to pandemics, critical for effective public health measures, is shaped by the availability and integration of diverse epidemiological outbreak data. Genomic surveillance has come to the forefront during the coronavirus disease 2019 (COVID-19) pandemic at both local and global scales to identify variants of concern. Tracking variants of concern (VOC) is integral to understanding the evolution of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) in space and time. Co...
Importance: High myopia (HM) is one of the leading causes of visual impairment and blindness worldwide. It is well-known that genetic factors play a significant role in the development of HM. Early school-aged population-based genetic screening and treatment should be performed to reduce HM complications. Objective: To identify risk variants in a large HM cohort and to examine the implications of universal genetic testing of individuals with HM with respect to clinical decision-making. Design, set...
Background: To study how clinical and genetic factors control the effectiveness of orthokeratology lenses in myopia. Methods: In this study, we conducted a retrospective clinical study of 545 children aged 8-12 years with myopia who were wearing orthokeratology lenses for one year and performed whole-genome sequencing (WGS) for 60 participants in two groups, one with rapid axial length progression of greater than 0.33 mm and the other with slow axial length progression of less than 0.09 mm. Genes in...
Purpose: To investigate the genetic relationships between primary open-angle glaucoma (POAG) and major visual pathways in the brain to better understand the neurological biology of glaucoma, which may facilitate the discovery of neuroprotective drug targets. Methods: We assessed the relationship between POAG and the volumes of five visual pathway regions using genetic correlation and polygenic risk score (PRS). We further used Mendelian randomisation (MR) to investigate the causal relationships. In...
Four members of a three-generation family with early-onset chorioretinal dystrophy were shown to be heterozygous carriers of the n.37C>T variant in MIR204. The identification of this previously reported pathogenic variant confirms the existence of a distinct clinical entity caused by a sequence change in MIR204. The chorioretinal dystrophy was variably associated with iris coloboma, congenital glaucoma, and premature cataracts, extending the phenotypic range of the condition. In silico analysis of the n....
The contribution of common tandem repeat (TR) variants to common, complex disease remains unknown, especially in populations historically underrepresented in genetic research. We identified common TR variants associated with risk of primary open-angle glaucoma (POAG) in individuals of African ancestry. The POAG-associated TR variants were predominantly found at Alu poly(A) tail elements, regions, retinal development enhancers, and harbor binding sites of a POAG-associated transcription factor, ...